Solving Large Scale Quadratic Constrained Basis Pursuit
Inspired by the alternating direction method of multipliers (ADMM) and the idea of
operator splitting, we propose an efficient algorithm for solving large-scale
quadratically constrained basis pursuit. Experimental results show that the
proposed algorithm achieves a 50 to 100 times speedup compared with the
baseline interior-point algorithm implemented in CVX.
Comment: 5 pages, 1 figure
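The abstract does not spell out the algorithm, but the operator-splitting idea can be illustrated with a minimal ADMM sketch for the simpler equality-constrained basis pursuit; the function name, parameters, and problem sizes below are illustrative assumptions, not the paper's code:

```python
import numpy as np

def soft_threshold(v, k):
    """Proximal operator of the l1 norm: shrink each entry toward zero by k."""
    return np.sign(v) * np.maximum(np.abs(v) - k, 0.0)

def admm_basis_pursuit(A, b, rho=5.0, n_iter=2000):
    """ADMM for equality-constrained basis pursuit: min ||x||_1 s.t. Ax = b."""
    m, n = A.shape
    P = A.T @ np.linalg.inv(A @ A.T)          # helper for projection onto {x : Ax = b}
    x, z, u = (np.zeros(n) for _ in range(3))
    for _ in range(n_iter):
        v = z - u
        x = v - P @ (A @ v - b)               # x-update: project onto the constraint set
        z = soft_threshold(x + u, 1.0 / rho)  # z-update: prox of the l1 norm
        u += x - z                            # dual variable update
    return z
```

Each iteration alternates a cheap projection with a cheap soft-thresholding step; for the quadratically constrained version in the paper, the affine projection would be replaced by a projection onto a ball around b.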
An Information-Theoretic Explanation for the Adversarial Fragility of AI Classifiers
We present a simple hypothesis about a compression property of artificial
intelligence (AI) classifiers, together with theoretical arguments showing that
this hypothesis successfully accounts for the observed fragility of AI
classifiers to small adversarial perturbations. We also propose a new method
for detecting when small input perturbations cause classifier errors, and show
theoretical guarantees for the performance of this detection method. We present
experimental results with a voice recognition system to demonstrate this
method. The ideas in this paper are motivated by a simple analogy between AI
classifiers and the standard Shannon model of a communication system.
Comment: 5 pages
Necessary and Sufficient Null Space Condition for Nuclear Norm Minimization in Low-Rank Matrix Recovery
Low-rank matrix recovery has found many applications in science and
engineering such as machine learning, signal processing, collaborative
filtering, system identification, and Euclidean embedding. But the low-rank
matrix recovery problem is NP-hard and thus challenging. A commonly
used heuristic is nuclear norm minimization. In [12,14,15], the
authors established the necessary and sufficient null space conditions for
nuclear norm minimization to recover every possible low-rank matrix with rank
at most r (the strong null space condition). In addition, in [12], Oymak et al.
established a null space condition for successful recovery of a given low-rank
matrix (the weak null space condition) using nuclear norm minimization, and
derived the phase transition for the nuclear norm minimization. In this paper,
we show that the weak null space condition in [12] is only a sufficient
condition for successful matrix recovery using nuclear norm minimization, and
is not a necessary condition as claimed in [12]. We further give
a weak null space condition for low-rank matrix recovery, which is both
necessary and sufficient for the success of nuclear norm minimization. At the
core of our derivation are an inequality for characterizing the nuclear norms
of block matrices, and the conditions for equality to hold in that inequality.
Comment: 17 pages, 0 figures
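The abstract does not state the block-matrix inequality explicitly; one classical inequality of this flavor is the pinching bound, which says the nuclear norm of a block matrix is at least the sum of the nuclear norms of its diagonal blocks. A quick numerical check (our own assumption about the intended inequality, not a quote from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((5, 3))
D = rng.standard_normal((5, 4))
M = np.block([[A, B], [C, D]])

def nuc(X):
    """Nuclear norm: sum of singular values."""
    return np.linalg.norm(X, 'nuc')

# Pinching: averaging M with diag(I,-I) @ M @ diag(I,-I) zeroes the off-diagonal
# blocks, and unitarily invariant norms do not increase under that averaging,
# so ||M||_* >= ||A||_* + ||D||_*.
assert nuc(M) >= nuc(A) + nuc(D) - 1e-9
```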
Trust but Verify: An Information-Theoretic Explanation for the Adversarial Fragility of Machine Learning Systems, and a General Defense against Adversarial Attacks
Deep-learning based classification algorithms have been shown to be
susceptible to adversarial attacks: minor changes to the input of classifiers
can dramatically change their outputs, while being imperceptible to humans. In
this paper, we present a simple hypothesis about a feature compression property
of artificial intelligence (AI) classifiers, together with theoretical arguments
showing that this hypothesis successfully accounts for the observed fragility
of AI classifiers to small adversarial perturbations. Drawing on ideas from
information and coding theory, we propose a general class of defenses for
detecting classifier errors caused by abnormally small input perturbations. We
further show theoretical guarantees for the performance of this detection
method. We present experimental results with (a) a voice recognition system,
and (b) a digit recognition system using the MNIST database, to demonstrate the
effectiveness of the proposed defense methods. The ideas in this paper are
motivated by a simple analogy between AI classifiers and the standard Shannon
model of a communication system.
Comment: 44 pages, 2 theorems, 35 figures, 29 tables. arXiv admin note:
substantial text overlap with arXiv:1901.0941
Derivation of Information-Theoretically Optimal Adversarial Attacks with Applications to Robust Machine Learning
We consider the theoretical problem of designing an optimal adversarial
attack on a decision system that maximally degrades the achievable performance
of the system as measured by the mutual information between the degraded signal
and the label of interest. This problem is motivated by the existence of
adversarial examples for machine learning classifiers. By adopting an
information theoretic perspective, we seek to identify conditions under which
adversarial vulnerability is unavoidable, i.e., even optimally designed
classifiers will be vulnerable to small adversarial perturbations. We present
derivations of the optimal adversarial attacks for discrete and continuous
signals of interest, i.e., finding the optimal perturbation distributions to
minimize the mutual information between the degraded signal and a signal
following a continuous or discrete distribution. In addition, we show that it
is much harder to achieve adversarial attacks for minimizing mutual information
when multiple redundant copies of the input signal are available. This provides
additional support to the recently proposed "feature compression" hypothesis
as an explanation for the adversarial vulnerability of deep learning
classifiers. We also report on results from computational experiments to
illustrate our theoretical results.
Comment: 16 pages, 5 theorems, 6 figures
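The redundancy effect can be seen in a toy discrete setting: a binary symmetric channel stands in for the paper's general signals, and mutual information grows when independent copies of the observation are available, so an attacker must do more to drive it down. All names and the channel model here are our own illustrative choices:

```python
import numpy as np
from itertools import product

def entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    p = np.asarray([q for q in probs if q > 0], dtype=float)
    return -np.sum(p * np.log2(p))

def mi_bsc(p, copies=1):
    """I(X; Y_1..Y_k) for uniform X observed through k independent BSC(p) channels."""
    joint = {}
    for x in (0, 1):
        for ys in product((0, 1), repeat=copies):
            pr = 0.5
            for y in ys:
                pr *= (1 - p) if y == x else p
            joint[(x, ys)] = pr
    p_y = {}
    for (x, ys), pr in joint.items():
        p_y[ys] = p_y.get(ys, 0.0) + pr
    # I(X;Y) = H(Y) - H(Y|X), with H(Y|X) = H(X,Y) - H(X) and H(X) = 1 bit
    return entropy(p_y.values()) - (entropy(joint.values()) - 1.0)
```

With crossover probability 0.1, one copy carries about 0.53 bits, and each added redundant copy strictly increases the mutual information between input and observations.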
Separation-Free Super-Resolution from Compressed Measurements is Possible: an Orthonormal Atomic Norm Minimization Approach
We consider the problem of recovering the superposition of distinct
complex exponential functions from compressed non-uniform time-domain samples.
Total Variation (TV) minimization and atomic norm minimization have been proposed
in the literature to recover the frequencies or the missing data. However, it
is known that in order for TV minimization and atomic norm minimization to
recover the missing data or the frequencies, the underlying frequencies are
required to be well-separated, even when the measurements are noiseless. This
paper shows that the Hankel matrix recovery approach can super-resolve the
complex exponentials and their frequencies from compressed non-uniform
measurements, regardless of how close their frequencies are to each other. We
propose a new concept of orthonormal atomic norm minimization (OANM), and
demonstrate that the success of Hankel matrix recovery in separation-free
super-resolution comes from the fact that the nuclear norm of a Hankel matrix
is an orthonormal atomic norm. More specifically, we show that, in traditional
atomic norm minimization, the underlying parameter values must be
well separated to achieve successful signal recovery if the atoms change
continuously with respect to the continuously valued parameter. In contrast,
OANM can succeed even when the original atoms are arbitrarily close.
As a byproduct of this research, we provide a matrix-theoretic inequality
for the nuclear norm, and give its proof using the theory of compressed sensing.
Comment: 39 pages
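The structural fact behind Hankel matrix recovery is that a sum of r distinct complex exponentials produces a Hankel matrix of rank exactly r, however close the frequencies are. A quick numeric check (our own construction, not the paper's code; the frequencies below are arbitrary):

```python
import numpy as np

def hankel(x, L):
    """Build the L x (len(x)-L+1) Hankel matrix whose rows are shifts of x."""
    return np.array([x[i:i + len(x) - L + 1] for i in range(L)])

N = 64
n = np.arange(N)
f1, f2 = 0.100, 0.104   # separation far below the ~1/N = 0.0156 resolution limit
x = np.exp(2j * np.pi * f1 * n) + np.exp(2j * np.pi * f2 * n)
s = np.linalg.svd(hankel(x, 32), compute_uv=False)
# The rank is exactly 2: the third singular value is numerically zero even
# though the two frequencies are nowhere near well separated.
```

TV and atomic norm minimization fail in this regime, while the rank-2 Hankel structure that the paper's approach exploits is unaffected by the small separation.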
Low-Cost and High-Throughput Testing of COVID-19 Viruses and Antibodies via Compressed Sensing: System Concepts and Computational Experiments
Coronavirus disease 2019 (COVID-19) is an ongoing pandemic infectious disease
outbreak that has significantly harmed and threatened the health and lives of
millions or even billions of people. COVID-19 has also significantly disrupted
the social and economic activities of many countries. With no
approved vaccine available at this moment, extensive testing for the COVID-19
virus is essential for disease diagnosis, confining virus spread,
contact tracing, and determining the right conditions for people to
return to normal economic activities. Identifying people who have antibodies
for COVID-19 can also help select persons who are suitable for undertaking
certain essential activities or returning to the workforce. However, the
throughputs of current testing technologies for COVID-19 viruses and antibodies
are often quite limited, which is not sufficient for dealing with the
anticipated fast-oscillating waves of COVID-19 spread affecting a significant
portion of the earth's population.
In this paper, we propose to use compressed sensing (group testing can be
seen as a special case of compressed sensing when it is applied to COVID-19
detection) to achieve high-throughput rapid testing of COVID-19 viruses and
antibodies, which can potentially provide a speedup of tens of times or more
compared with current testing technologies. The proposed compressed sensing
system for high-throughput testing can utilize the expander-graph-based
compressed sensing matrices developed by us \cite{Weiyuexpander2007}.
Comment: 11 pages
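A minimal sketch of how a pooled design screens for a few positives: a random Bernoulli pooling matrix and the classic COMP decoder stand in here for the paper's expander-graph construction, and all sizes and names are illustrative assumptions:

```python
import numpy as np

def comp_decode(M, y):
    """COMP decoder: any sample appearing in a negative pool is negative;
    everything else is declared (possibly) positive.  No false negatives."""
    in_negative_pool = M[~y].sum(axis=0) > 0
    return ~in_negative_pool

rng = np.random.default_rng(1)
n_samples, n_tests = 60, 24
M = rng.random((n_tests, n_samples)) < 0.12    # random pooling design (True = pooled)
x = np.zeros(n_samples, dtype=bool)
x[[5, 40]] = True                              # two infected samples
y = (M.astype(int) @ x.astype(int)) > 0        # a pool tests positive iff it
                                               # contains a positive sample
x_hat = comp_decode(M, y)
```

Here 24 pooled tests screen 60 samples; COMP never misses a true positive, and better-structured designs (such as the expander matrices the abstract cites) control the false positives as well.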
Fast dose optimization for rotating shield brachytherapy
Purpose: To provide a fast computational method, based on the proximal operator
graph solver (POGS), a convex optimization solver using the alternating direction
method of multipliers (ADMM), for calculating an optimal treatment plan in
rotating shield brachytherapy (RSBT). RSBT treatment planning has more degrees
of freedom than conventional high-dose-rate brachytherapy (HDR-BT) due to the
addition of emission direction, and this necessitates a fast optimization
technique to enable clinical usage.
Methods: The multi-helix RSBT (H-RSBT) delivery technique was considered with
five representative cervical cancer patients. Treatment plans were generated for
all patients using the POGS method and the previously considered commercial
solver IBM CPLEX. The rectum, bladder, sigmoid, high-risk clinical target volume
(HR-CTV), and HR-CTV boundary were the structures considered in our optimization
problem, called asymmetric dose-volume optimization with smoothness control.
Dose calculation resolution was 1x1x3 mm^3 for all cases. The H-RSBT applicator
has 6 helices, with 33.3 mm of translation along the applicator per helical
rotation and 1.7 mm spacing between dwell positions, yielding 17.5 degree
emission angle spacing per 5 mm along the applicator.
Results: For each patient, HR-CTV D90, HR-CTV D100, rectum D2cc, sigmoid D2cc,
and bladder D2cc matched within 1% for CPLEX and POGS. We also obtained similar
EQD2 figures for CPLEX and POGS. POGS was around 18 times faster than CPLEX.
Over all patients, total optimization times were 32.1-65.4 seconds for CPLEX and
2.1-3.9 seconds for POGS.
Conclusions: POGS reduced treatment plan optimization time by a factor of about
18 for RSBT, with HR-CTV D90, OAR D2cc, and EQD2 values similar to CPLEX, which
is significant progress toward clinical translation of RSBT. POGS is also
applicable to conventional HDR-BT.
Comment: 9 pages, 3 figures
Optimal Pooling Matrix Design for Group Testing with Dilution (Row Degree) Constraints
In this paper, we consider the problem of designing an optimal pooling matrix
for group testing (for example, for COVID-19 virus testing) under the constraint
that no more than a given number of samples can be pooled together, which we
call the "dilution constraint". This problem translates to designing a 0-1
matrix that has no more than a given number of '1's in each row and offers a
certain performance guarantee for identifying anomalous elements. We explicitly
give pooling matrix designs that satisfy the dilution constraint and have
performance guarantees for identifying anomalous elements, and we prove their
optimality in saving the largest number of tests, namely showing that the
designed matrices have the largest width-to-height ratio among all
constraint-satisfying 0-1 matrices.
Comment: group testing design, COVID-19
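As a toy illustration of the dilution constraint itself, here is a naive half-overlapping design of our own (not one of the paper's optimal matrices): each pool tests d consecutive samples, so no pool exceeds d samples while every sample is still covered twice:

```python
import numpy as np

def overlapping_pool_design(n_samples, d):
    """0-1 pooling matrix: each row pools d consecutive samples (cyclically),
    consecutive rows overlap by d/2, so every column is covered by two pools."""
    rows = []
    for start in range(0, n_samples, max(d // 2, 1)):
        row = np.zeros(n_samples, dtype=int)
        row[[(start + j) % n_samples for j in range(d)]] = 1
        rows.append(row)
    return np.array(rows)

M = overlapping_pool_design(20, 4)   # 10 pools of at most 4 samples each
```

The paper's contribution is maximizing the width-to-height ratio (samples per test) subject to the row-weight bound; this sketch shows only that the constraint is easy to state and check.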
Error Correction Codes for COVID-19 Virus and Antibody Testing: Using Pooled Testing to Increase Test Reliability
We consider a novel method to increase the reliability of COVID-19 virus or
antibody tests by using specially designed pooled testing. Instead of testing
nasal swab or blood samples from individual persons, we propose to test
mixtures of samples from many individuals. The pooled sample testing method
proposed in this paper also serves a different purpose: increasing test
reliability and providing accurate diagnoses even if the tests themselves are
not very accurate. Our method uses ideas from compressed sensing and
error-correction coding to correct for a certain number of errors in the test
results. The intuition is that when each individual's sample is part of many
pooled sample mixtures, the test results from all of the sample mixtures
contain redundant information about each individual's diagnosis, which can be
exploited to automatically correct for wrong test results in exactly the same
way that error correction codes correct errors introduced in noisy
communication channels. While such redundancy can also be achieved by simply
testing each individual's sample multiple times, we present simulations and
theoretical arguments that show that our method is significantly more efficient
in increasing diagnostic accuracy. In contrast to group testing and compressed
sensing which aim to reduce the number of required tests, this proposed error
correction code idea purposefully uses pooled testing to increase test
accuracy, and works not only in the "undersampling" regime, but also in the
"oversampling" regime, where the number of tests is bigger than the number of
subjects. The results in this paper run against the traditional belief that
"even though pooled testing increases test capacity, pooled tests are less
reliable than testing individuals separately."
Comment: 14 pages, 15 figures
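The error-correction intuition can be sketched numerically: with more pooled tests than samples (the "oversampling" regime), an L1 least-absolute-deviations fit recovers every sample's value despite one grossly wrong test result, while ordinary least squares is thrown off. The quantitative measurement model and the IRLS solver are our own simplifications of the paper's coding-theoretic method:

```python
import numpy as np

def irls_l1(M, y, n_iter=50, eps=1e-6):
    """Least-absolute-deviations fit, min ||y - Mx||_1, via iteratively
    reweighted least squares: downweights equations with large residuals."""
    x = np.linalg.lstsq(M, y, rcond=None)[0]
    for _ in range(n_iter):
        w = np.sqrt(1.0 / np.maximum(np.abs(y - M @ x), eps))
        x = np.linalg.lstsq(M * w[:, None], w * y, rcond=None)[0]
    return x

rng = np.random.default_rng(0)
n_tests, n_samples = 20, 5             # oversampling: 4x more tests than samples
M = rng.standard_normal((n_tests, n_samples))
x_true = rng.random(n_samples)         # hypothetical per-sample viral loads
y = M @ x_true
y[7] += 3.0                            # one grossly wrong test result
x_hat = irls_l1(M, y)                  # recovers x_true despite the bad test
```

Because each sample contributes to many pools, the 19 correct tests carry enough redundant information to out-vote the single corrupted one, exactly the mechanism by which error-correcting codes fix channel errors.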